{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WfGpInivS0fG"
   },
   "source": [
    "<h2 align=\"center\">点击下列图标在线运行HanLP</h2>\n",
    "<div align=\"center\">\n",
    "\t<a href=\"https://colab.research.google.com/github/hankcs/HanLP/blob/doc-zh/plugins/hanlp_demo/hanlp_demo/zh/sdp_restful.ipynb\" target=\"_blank\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\t<a href=\"https://mybinder.org/v2/gh/hankcs/HanLP/doc-zh?filepath=plugins%2Fhanlp_demo%2Fhanlp_demo%2Fzh%2Fsdp_restful.ipynb\" target=\"_blank\"><img src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open In Binder\"/></a>\n",
    "</div>\n",
    "\n",
    "## 安装"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IYwV-UkNNzFp"
   },
   "source": [
    "无论是Windows、Linux还是macOS，HanLP的安装只需一句话搞定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1Uf_u7ddMhUt"
   },
   "outputs": [],
   "source": [
    "!pip install hanlp_restful -U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pp-1KqEOOJ4t"
   },
   "source": [
    "## 创建客户端"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "0tmKBu7sNAXX"
   },
   "outputs": [],
   "source": [
    "from hanlp_restful import HanLPClient\n",
    "HanLP = HanLPClient('https://www.hanlp.com/api', auth=None, language='zh') # auth不填则匿名，zh中文，mul多语种"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EmZDmLn9aGxG"
   },
   "source": [
    "#### 申请秘钥\n",
    "由于服务器算力有限，匿名用户每分钟限2次调用。如果你需要更多调用次数，[建议申请免费公益API秘钥auth](https://bbs.hanlp.com/t/hanlp2-1-restful-api/53)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "elA_UyssOut_"
   },
   "source": [
    "## 语义依存分析\n",
    "任务越少，速度越快。如指定仅执行语义依存分析："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "BqEmDMGGOtk3",
    "outputId": "2a0d392f-b99a-4a18-fc7f-754e2abe2e34"
   },
   "outputs": [],
   "source": [
    "doc = HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', tasks='sdp')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "返回值为一个[Document](https://hanlp.hankcs.com/docs/api/common/document.html):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"tok/fine\": [\n",
      "    [\"2021年\", \"HanLPv2.1\", \"为\", \"生产\", \"环境\", \"带来\", \"次\", \"世代\", \"最\", \"先进\", \"的\", \"多\", \"语种\", \"NLP\", \"技术\", \"。\"]\n",
      "  ],\n",
      "  \"sdp\": [\n",
      "    [[[6, \"Time\"]], [[6, \"Agt\"]], [[5, \"mPrep\"]], [[5, \"Desc\"]], [[6, \"Datv\"]], [[0, \"Root\"]], [[8, \"Qp\"]], [[15, \"TDur\"]], [[10, \"mDegr\"]], [[15, \"Desc\"]], [[10, \"mAux\"]], [[13, \"Quan\"]], [[15, \"Desc\"]], [[15, \"Nmod\"]], [[6, \"Cont\"]], [[6, \"mPunc\"]]]\n",
      "  ]\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "print(doc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`doc['sdp']`字段代表语义依存图的数组格式，数组中第`i`个子数组代表第`i`个单词的语义依存关系，子数组中每个二元组的格式为`[中心词的下标, 与中心词的语义依存关系]`。每个单词的语义依存关系可能有零个、一个或多个（任意数量）。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "转换为[CoNLLSentence](https://hanlp.hankcs.com/docs/api/common/conll.html#hanlp_common.conll.CoNLLSentence)格式更容易观察："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\t2021年\t_\t_\t_\t_\t_\t_\t6:Time\t_\n",
      "2\tHanLPv2.1\t_\t_\t_\t_\t_\t_\t6:Agt\t_\n",
      "3\t为\t_\t_\t_\t_\t_\t_\t5:mPrep\t_\n",
      "4\t生产\t_\t_\t_\t_\t_\t_\t5:Desc\t_\n",
      "5\t环境\t_\t_\t_\t_\t_\t_\t6:Datv\t_\n",
      "6\t带来\t_\t_\t_\t_\t_\t_\t0:Root\t_\n",
      "7\t次\t_\t_\t_\t_\t_\t_\t8:Qp\t_\n",
      "8\t世代\t_\t_\t_\t_\t_\t_\t15:TDur\t_\n",
      "9\t最\t_\t_\t_\t_\t_\t_\t10:mDegr\t_\n",
      "10\t先进\t_\t_\t_\t_\t_\t_\t15:Desc\t_\n",
      "11\t的\t_\t_\t_\t_\t_\t_\t10:mAux\t_\n",
      "12\t多\t_\t_\t_\t_\t_\t_\t13:Quan\t_\n",
      "13\t语种\t_\t_\t_\t_\t_\t_\t15:Desc\t_\n",
      "14\tNLP\t_\t_\t_\t_\t_\t_\t15:Nmod\t_\n",
      "15\t技术\t_\t_\t_\t_\t_\t_\t6:Cont\t_\n",
      "16\t。\t_\t_\t_\t_\t_\t_\t6:mPunc\t_\n"
     ]
    }
   ],
   "source": [
    "print(doc.to_conll())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XOsWkOqQfzlr"
   },
   "source": [
    "为已分词的句子执行语义依存分析："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "bLZSTbv_f3OA",
    "outputId": "111c0be9-bac6-4eee-d5bd-a972ffc34844"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\tHanLP\t_\t_\t_\t_\t_\t_\t5:Agt\t_\n",
      "2\t为\t_\t_\t_\t_\t_\t_\t4:mPrep\t_\n",
      "3\t生产\t_\t_\t_\t_\t_\t_\t4:Desc\t_\n",
      "4\t环境\t_\t_\t_\t_\t_\t_\t5:Datv\t_\n",
      "5\t带来\t_\t_\t_\t_\t_\t_\t0:Root\t_\n",
      "6\t次世代\t_\t_\t_\t_\t_\t_\t12:Time\t_\n",
      "7\t最\t_\t_\t_\t_\t_\t_\t8:mDegr\t_\n",
      "8\t先进\t_\t_\t_\t_\t_\t_\t12:Desc\t_\n",
      "9\t的\t_\t_\t_\t_\t_\t_\t8:mAux\t_\n",
      "10\t多语种\t_\t_\t_\t_\t_\t_\t12:Desc\t_\n",
      "11\tNLP\t_\t_\t_\t_\t_\t_\t12:Nmod\t_\n",
      "12\t技术\t_\t_\t_\t_\t_\t_\t5:Cont\t_\n",
      "13\t。\t_\t_\t_\t_\t_\t_\t5:mPunc\t_\n",
      "\n",
      "1\t我\t_\t_\t_\t_\t_\t_\t3:Poss\t_\n",
      "2\t的\t_\t_\t_\t_\t_\t_\t1:mAux\t_\n",
      "3\t希望\t_\t_\t_\t_\t_\t_\t0:Root|4:Exp\t_\n",
      "4\t是\t_\t_\t_\t_\t_\t_\t5:mMod\t_\n",
      "5\t希望\t_\t_\t_\t_\t_\t_\t4:dClas\t_\n",
      "6\t张晚霞\t_\t_\t_\t_\t_\t_\t8:Poss\t_\n",
      "7\t的\t_\t_\t_\t_\t_\t_\t6:mAux\t_\n",
      "8\t背影\t_\t_\t_\t_\t_\t_\t11:Pat\t_\n",
      "9\t被\t_\t_\t_\t_\t_\t_\t10:mPrep\t_\n",
      "10\t晚霞\t_\t_\t_\t_\t_\t_\t11:Exp\t_\n",
      "11\t映红\t_\t_\t_\t_\t_\t_\t5:dCont\t_\n",
      "12\t。\t_\t_\t_\t_\t_\t_\t5:mPunc\t_\n"
     ]
    }
   ],
   "source": [
    "print(HanLP(tokens=[\n",
    "    [\"HanLP\", \"为\", \"生产\", \"环境\", \"带来\", \"次世代\", \"最\", \"先进\", \"的\", \"多语种\", \"NLP\", \"技术\", \"。\"],\n",
    "    [\"我\", \"的\", \"希望\", \"是\", \"希望\", \"张晚霞\", \"的\", \"背影\", \"被\", \"晚霞\", \"映红\", \"。\"]\n",
    "  ], tasks='sdp').to_conll())"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "sdp_restful.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
